Bump-and-hole engineering of human polypeptide N-acetylgalactosamine transferases to dissect their protein substrates and glycosylation sites in cells

Summary Despite the known disease relevance of glycans, the biological function and substrate specificities of individual glycosyltransferases are often ill-defined. Here, we describe a protocol to develop chemical, bioorthogonal reporters for the activity of the GalNAc-T family of glycosyltransferases using a tactic termed bump-and-hole engineering. This allows identification of the protein substrates and glycosylation sites of single GalNAc-Ts. Despite requiring transfection of cells with the engineered transferases and enzymes for biosynthesis of bioorthogonal substrates, the tactic complements methods in molecular biology. For complete details on the use and execution of this protocol, please refer to Schumann et al. (2020),1 Cioce et al. (2021),2 and Cioce et al. (2022).3

1. Inspect the sequence of the GalNAc-T to identify the location of the TM domain, the catalytic domain and the lectin domain.
   a. The sequence features in the UniProt database can be used to facilitate this process.
2. Design primers so that upon PCR amplification from cDNA sources the N-terminus of the protein is positioned immediately C-terminal to the TM domain and the catalytic and lectin domains are intact.
   a. The truncated sequence can then be cloned into an expression vector specific for the desired host system.
Note: Multiple strategies have been used to express soluble forms of all 20 GalNAc-Ts (Table 1). In the original publication,1 a secretion construct of BH GalNAc-T2 was designed according to literature precedent for crystallization of the wild-type (WT) enzyme5 and then cloned with a His6 tag into a pOPING vector. Newer renditions have used pGEN2-DEST constructs (Figure 3) developed by Moremen et al.30 The provision of these vectors is an invaluable advance to the field. Vectors are commercially available from DNASU (https://dnasu.org/DNASU/Home.do) and allow secreted expression of most truncated GalNAc-Ts in mammalian cells (except GalNAc-T8, -T17, -T19 and -T20). Nevertheless, the protein expression levels produced with these constructs can vary. For this reason, alternative constructs may be required, and the expression strategies and host organisms may need to be optimized.
   a. Sonicate, taking care not to warm.
13. Prepare 5 mM solutions of UDP-GalNAc and compound 1 in MilliQ water. Store at −20 °C or −80 °C.
The BH GalNAc-T7 version of this construct30 was successfully expressed, purified and used in in vitro glycosylation experiments.41
14. Prepare a 1 mM stock solution of WT and BH GalNAc-T in 1× Buffer.
CRITICAL: MnCl2 can be hazardous to human health. Always read the corresponding Material Safety Data Sheet (MSDS) before using this reagent and always handle it with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles). Prepare the 1 mM stock solution of enzyme in 1× Buffer immediately before use and only prepare as much as you will need for the experiment. The enzyme cannot be stored in the 1× Buffer so any solution that is not used in the experiment should be discarded. The enzyme should always be handled on ice. Freeze-thawing cycles may affect the integrity of the enzyme. It is advisable to prepare aliquots with the volume required for the experiment to keep the freeze-thawing cycles to a minimum.
Note: WT and BH GalNAc-T should be stable for months when stored at −80 °C in Freezing Buffer. The peptide and UDP-sugar solutions can be stored for several months at −20 °C but freeze-thawing may also affect their stability. It is therefore advisable to prepare aliquots with the volume required for the experiment to keep freeze-thawing cycles of the stock solutions to a minimum. Since MnCl2 can become oxidized, a fresh solution should be made after 2 weeks to 1 month, or earlier if the solution turns brown.

Michaelis-Menten kinetics
Timing: 20 min
CRITICAL: MnCl2 can be hazardous to human health. Always read the corresponding MSDS before using this reagent and always handle it with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles). Prepare the 10× stock solutions of enzyme in 1× Buffer immediately before use and only prepare as much as you will need for the experiment. The enzyme cannot be stored in the 1× Buffer so any solution that is not used in the experiment should be discarded. The enzyme should always be handled on ice. Freeze-thawing cycles may affect the integrity of the enzyme. It is advisable to prepare aliquots with the volume required for the experiment to keep the freeze-thawing cycles to a minimum.
Note: WT and BH GalNAc-T should be stable for months when stored at −80 °C in Freezing Buffer. The peptide and UDP-sugar solutions can be stored for several months at −20 °C but freeze-thawing may also affect their stability. It is therefore advisable to prepare aliquots with the volume required for the experiment to keep freeze-thawing cycles of the stock solutions to a minimum. Since MnCl2 can become oxidized, a fresh solution should be made after 2 weeks to 1 month, or earlier if the solution turns brown.
In-Fusion Cloning of full-length WT and BH GalNAc-T
Timing: 5 min
21. Design the PCR primers with 15 base pair (bp) extensions that are complementary to the ends of the pSBbi vector.
   a. We use the In-Fusion Cloning action in SnapGene to simplify PCR primer design (https://www.snapgene.com/resources/in-fusion-cloning/?referrer=SnapGene).
   b. In the Vector tab, open the file containing the desired pSBbi-based plasmid and select the option to linearize the vector with the SfiI restriction enzyme.
   c. In the Fragment tab, open the file containing the desired WT or BH GalNAc-T sequence.
      i. Highlight the region to be inserted.
      ii. Select the option to use the fragment as a template for PCR.
      iii. It is crucial to switch the orientation of the fragment by selecting the arrow pointing towards the left due to the nature of the pSBbi-based plasmids.
   d. In the Product tab, select the option to Choose Overlapping PCR Primers to allow SnapGene to design the most suitable primers for the experiment.
      i. Tick the option to regenerate the upstream and downstream SfiI sites from the Vector.
      ii. SnapGene will provide the primer sequences to be used in the In-Fusion cloning reaction and the sequence of the final product.
   e. Modify the sequence of the reverse primer provided by SnapGene so that a VSV-G tag is cloned at the C-terminus of the GalNAc-T followed by a stop codon.
      i. For example, the forward and reverse primers used for GalNAc-T2 were AAAGGCCTCTGAGGCCACCATGCGGCGGCGCGCTCG and TTTGGCCTGACAGGCCCTACTTACCCAGGCGGTTCATTTCGATATCAGTGTACTGCTGCAGGTTGAGCGGTG respectively (VSV-G tag underlined).1
22. Order the primers.
23. Prepare 10 mM solutions of the forward and reverse primers in nuclease-free water. Store at −20 °C or −80 °C.
   a. The primer solutions should be stable for months when stored at −20 °C or −80 °C.
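The primer design in step 21 can be sanity-checked in a few lines of code. The following is a minimal sketch, not part of the published workflow: it takes the two GalNAc-T2 primers quoted above, confirms that each carries an SfiI recognition sequence (GGCCNNNNNGGCC), and reads the coding strand of the reverse primer to confirm that the VSV-G tag (YTDIEMNRLGK) is followed directly by a stop codon. The helper names and the reduced codon table are illustrative assumptions.

```python
import re

# Primer sequences quoted in step 21.e.i (typesetting spaces removed).
FWD = "AAAGGCCTCTGAGGCCACCATGCGGCGGCGCGCTCG"
REV = ("TTTGGCCTGACAGGCCCTACTTACCCAGGCGGTTCATTTCGATATCAGTGTA"
       "CTGCTGCAGGTTGAGCGGTG")

# SfiI recognition sequence: GGCC, any five bases, GGCC.
SFII = re.compile(r"GGCC[ACGT]{5}GGCC")

# Reduced codon table (assumption: only the codons needed for this example).
CODONS = {"TAC": "Y", "ACT": "T", "GAT": "D", "ATC": "I", "GAA": "E",
          "ATG": "M", "AAC": "N", "CGC": "R", "CTG": "L", "GGT": "G",
          "AAG": "K", "TAG": "*"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate in frame 1; codons missing from the table become '?'."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODONS.get(seq[i:i + 3], "?") for i in range(0, usable, 3))


def tag_before_sfii(reverse_primer, tag_codons=12):
    """Translate the codons immediately upstream of the SfiI site
    on the coding strand of the reverse primer (11 tag residues + stop)."""
    coding = reverse_complement(reverse_primer)
    site = SFII.search(coding)
    if site is None:
        raise ValueError("no SfiI site found in reverse primer")
    start = site.start() - 3 * tag_codons
    return translate(coding[start:site.start()])


if __name__ == "__main__":
    print("Forward SfiI site:", SFII.search(FWD).group())
    print("Tag + stop read from reverse primer:", tag_before_sfii(REV))
```

A check like this is useful after manually editing the reverse primer in step 21.e, since an off-by-one insertion would shift the tag out of frame or lose the stop codon.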
CRITICAL: Reagents such as hygromycin B, penicillin and streptomycin can be hazardous to human health. Always read the corresponding MSDS before using these reagents and always handle them with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles).
Note: The growth medium with and without hygromycin B should be prepared fresh for the experiment and stored at 4 °C for a maximum of one month. It should be visually inspected for contamination every time it is used. If any contamination is observed, the medium should be discarded and a fresh solution prepared.

Cell surface labeling experiments
Timing: 20 min
27. Prepare 50 mM solutions of compound 2 and Ac4ManNAl in DMSO. Store at −80 °C.
   a. The 50 mM solutions of compound 2 and Ac4ManNAl should be stable for months when stored at −80 °C.
28. Prepare a 2% FBS in 1× PBS solution. Store at 4 °C.
   a. The 2% FBS in 1× PBS should be stored at 4 °C for a maximum of one month, and it should be visually inspected for contamination every time it is used.
   b. If any contamination is observed the solution should be discarded and a fresh solution prepared.
29. Prepare a 50 mM BTTAA solution in MilliQ water. Store at −20 °C.
   a. The 50 mM BTTAA solution should be stable for months when stored at −20 °C.
30. Prepare a 10 mM CF680 picolyl azide solution in DMSO. Store at −20 °C.
   a. The 10 mM CF680 picolyl azide solution should be stable for months when stored at −20 °C.
31. Prepare a fresh 30 mM CuSO4 solution in MilliQ water.
32. Prepare a fresh 100 mM sodium ascorbate solution in MilliQ water.
33. Prepare a fresh 100 mM aminoguanidinium chloride solution in 1× PBS.
34. Prepare a fresh 2× copper-catalyzed azide-alkyne cycloaddition (CuAAC) solution I as described in the "materials and equipment" section.
35. Prepare quenching solution: 3 mM bathocuproinedisulfonic acid in 1× PBS. Store at −20 °C.
   a. The quenching solution should be stable for months when stored at −20 °C.
36. Prepare fresh lysis buffer I as described in the "materials and equipment" section.
37. Prepare fresh diluted PNGase F solution: 1:10 (v/v) dilution in PBS.
   a. The fixing solution should be stable for months when stored at 20 °C-25 °C.
CRITICAL: Reagents such as aminoguanidinium chloride, CuSO4, DTT, and SDS can be hazardous to human health. Always read the corresponding MSDS before using these reagents and always handle them with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles). Acetic acid and methanol are flammable and hazardous to human health. Keep these solvents away from ignition sources and always handle them in a chemical fume hood using appropriate personal protective cover. Prepare the diluted PNGase F solution immediately before use and only prepare as much as you will need for the experiment. Any solution that is not used in the experiment should be discarded. The enzyme should always be handled on ice.
Note: The 30 mM CuSO4, the 100 mM sodium ascorbate and the 100 mM aminoguanidinium chloride solutions should be made fresh for each experiment. Any solution that is not used in the experiment should be discarded. PNGase F stock should be stable for months when stored at 4 °C.
   a. Transfer beads to 15 mL Greiner tubes.
   b. Place the beads in a magnetic rack.
   c. Collect supernatant and store at 4 °C until later (this buffer will be used once the methylation is complete to store the beads).
   d. Wash the beads three times with 10 mL of PBS-T.
   e. Place the beads in a magnetic rack.
CRITICAL: Reagents such as aminoguanidinium chloride, AmBic, CuSO4 and DTT can be hazardous to human health. Always read the corresponding MSDS before using these reagents and always handle them with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles). Acetonitrile, formaldehyde, formic acid, methanol and sodium cyanoborohydride are flammable and hazardous to human health. Keep these reagents away from ignition sources and always handle them in a chemical fume hood using appropriate personal protective cover.

Sample preparation for mass spectrometry (MS)-proteomics and glycoproteomics
Note: Since the original publication, a new compound (Ac4GalN6yne, compound 3) and a new construct containing an additional biosynthetic enzyme (N-acetylhexosamine 1-kinase (NahK) from Bifidobacterium longum) have been introduced that allow biosynthesis of compound 1 (Figure 5) in a more straightforward manner, as compound 3 is synthetically much more accessible than compound 2.3 Furthermore, the GalNAc-T-specific glycoprotein labeling and glycoproteomics workflow has been optimized for adherent MCF7 cells.3
The 30 mM CuSO4, the 200 mM sodium ascorbate and the 200 mM aminoguanidinium chloride solutions should be made fresh for each experiment. Any solution that is not used in the experiment can be discarded. The Reagent A and B solutions should be made fresh for the experiment and any solution that is not used should be discarded in the appropriate waste. The 0.1 mg/mL Lys-C and trypsin solutions in LC/MS-grade water should be stable for months when stored at −80 °C.
Since freeze-thawing cycles may affect the integrity of the enzymes, it is advisable to prepare aliquots with the volume required for the experiment and discard any solution that is not used. The 40% acetonitrile in LC/MS-grade water, 1% (v/v) formic acid in LC/MS-grade water, the conditioning solvent, the loading buffer, the wash buffer and the elution buffer should all be made fresh and stored at 4 °C for a maximum of one week since acetonitrile and formic acid are volatile. These solutions should all be prepared in new, clean glass vials since detergents and plastic leaching can cause background signal in the mass spectrometer and interfere with the analysis of the samples.

Mass spectrometry data analysis
Timing: 10 min
69. Prepare a FASTA file containing the relevant protein sequences for your sample proteomes.
   a. If using human cells, download the Homo sapiens database from UniProt or Swiss-Prot.
Figure panel (C): BH GalNAc-T, mut-AGX1 and NahK co-expression construct required for this strategy to work in cells.

MATERIALS AND EQUIPMENT
CRITICAL: Reagents such as aminoguanidinium chloride and CuSO4 can be hazardous to human health. Always read the corresponding MSDS before using these reagents and always handle them with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles). The 2× CuAAC solution I should be prepared by adding the components in the order of the table.
CRITICAL: Reagents such as Halt Protease Inhibitors, Triton X-100, sodium deoxycholate and SDS can be hazardous to human health. Always read the corresponding MSDS before using these reagents and always handle them with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles). The benzonase nuclease and Halt protease inhibitors must be added to the lysis buffer immediately before adding the lysis buffer to the cells. Only prepare as much lysis buffer I as you will need for the experiment. Any solution that isn't used in the experiment should be discarded.
Note: Lysis buffer containing all reagents except benzonase nuclease and Halt Protease inhibitors can be prepared and stored at 4 °C in the dark (since sodium deoxycholate is light sensitive) for a couple of months or until some precipitate is observed.
CRITICAL: Reagents such as Halt Protease Inhibitors, Triton X-100, sodium deoxycholate and SDS can be hazardous to human health. Always read the corresponding MSDS before using these reagents and always handle them with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles). The benzonase nuclease, Halt protease inhibitors and PUGNAc must be added to the lysis buffer immediately before adding the lysis buffer to the cells. Only prepare as much lysis buffer II as you will need for the experiment. Any solution that is not used in the experiment should be discarded.
Note: Lysis buffer containing all reagents except benzonase nuclease, Halt Protease inhibitors and PUGNAc can be prepared and stored at 4 °C in the dark (since sodium deoxycholate is light sensitive) for a couple of months or until some precipitate is observed. The difference between lysis buffer I and lysis buffer II is the addition of PUGNAc (which is an O-GlcNAcase and β-hexosaminidase inhibitor) to the latter.
CRITICAL: Reagents such as aminoguanidinium chloride and CuSO4 can be hazardous to human health. Always read the corresponding MSDS before using these reagents and always handle them with care using appropriate personal protective cover (e.g., lab coat, gloves and safety goggles). The 10× CuAAC solution II should be prepared by adding the components in the order of the table.
Note: The differences between CuAAC solution I and CuAAC solution II are the different concentrations of the reagents and the different azide-containing probes used (CF680 picolyl azide in the former and Biotin-DADPS-picolyl azide in the latter).

STEP-BY-STEP METHOD DETAILS
Cloning of BH GalNAc-Ts
This step is performed to replace the gatekeeper residues in the active site of the GalNAc-T with alanines and thus generate the BH GalNAc-T. This is performed on both the full-length and truncated constructs of the GalNAc-T, with the two point mutations introduced sequentially.
Pause point: The PCR products can be stored at −20 °C before performing the KLD reaction. The KLD reaction products can be stored at −20 °C before performing the transformation step.
Note: We recommend using the pDONR221 and pGEN2-DEST vectors for the full-length and truncated constructs respectively, which are commercially available from DNASU. We have observed that the mutagenesis efficiency strongly depends on the vector used; if alternative constructs are used, the mutagenesis protocol may therefore need to be optimized accordingly.
Protein expression and Ni-NTA purification of WT and BH GalNAc-T

Timing: 2 weeks
This step is performed to express purified recombinant His-tagged truncated WT and BH GalNAc-Ts.
8 Buffer.
   i. Determine the protein concentration.
   j. Aliquot, flash-freeze in liquid nitrogen and store at −80 °C.
      i. WT and BH GalNAc-T should be stable for months when stored at −80 °C in Freezing Buffer.
Note: We recommend determining the protein concentration by densitometry of Coomassie-stained SDS-PAGE gel bands and comparison to known standards of bovine serum albumin, since certain components in the enzyme preparation (e.g., imidazole) may interfere with measurements performed by BCA assay or NanoDrop.
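The densitometry comparison described in the note above amounts to a standard curve. As a minimal sketch, assuming that band intensity is linear in the amount of protein loaded over the range of the standards, the BSA lanes can be fitted by least squares and the enzyme band interpolated from the fit. The function names and the numbers in the usage example are hypothetical and serve only to illustrate the arithmetic.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_amount_from_band(band_intensity, bsa_amounts, bsa_intensities):
    """Interpolate the amount of protein in a band from a BSA standard curve.

    bsa_amounts and the returned value share the same unit (e.g., ng per lane).
    Assumes the enzyme band lies within the linear range of the standards.
    """
    slope, intercept = fit_line(bsa_amounts, bsa_intensities)
    return (band_intensity - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standards: ng of BSA loaded and integrated band intensity.
    bsa_ng = [100, 250, 500, 1000]
    bsa_signal = [210, 505, 1020, 1990]
    amount = protein_amount_from_band(760, bsa_ng, bsa_signal)
    print(f"Estimated protein in band: {amount:.0f} ng")
```

Dividing the interpolated amount by the volume of enzyme preparation loaded in that lane gives the concentration of the stock.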

In vitro glycosylation experiments
Timing: 2 days
This step assesses and compares the enzymatic activities of the WT and BH systems. To ensure successful bump-and-hole engineering, the BH GalNAc-T must remain biochemically competent whilst orthogonal to the WT enzyme. Moreover, to ensure that bump-and-hole engineering does not affect the enzyme's peptide substrate affinity and specificity, this experiment should also be carried out with a small panel of synthetic peptides.
13. Determine glycopeptide formation by HPLC-MS and MS+ peak integration (Figure 6) using Equation 1.
$$\text{Glycopeptide formation (\%)} = \frac{\text{Peak area of product}}{\text{Peak area of starting material} + \text{Peak area of product}} \times 100 \qquad \text{(Equation 1)}$$
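Equation 1 translates directly into code. The sketch below assumes that the two peak areas come from the same chromatogram and are expressed in the same arbitrary units; the function name and the example values are illustrative only.

```python
def glycopeptide_formation_percent(product_area, starting_material_area):
    """Equation 1: product peak area as a percentage of the summed
    product and starting-material peak areas."""
    total = product_area + starting_material_area
    if total <= 0:
        raise ValueError("summed peak area must be positive")
    return 100.0 * product_area / total


if __name__ == "__main__":
    # Hypothetical integrated peak areas (arbitrary units).
    percent = glycopeptide_formation_percent(2.5e6, 7.5e6)
    print(f"Glycopeptide formation: {percent:.1f}%")
```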
CRITICAL: Always handle the enzyme on ice and add it to the reaction mixture last.
Optional: A fluorescently-labelled synthetic peptide may be used instead, followed by determination of glycopeptide formation by UV peak integration. However, the chromatography method may need to be optimized to separate the peptide starting material and glycopeptide product peaks. The UDP-Glo™ Glycosyltransferase Assay may also be used as an alternative way of quantifying the activity of the enzymes; however, it will not take into account UDP-sugar hydrolysis, which may skew the results.
Note: The activity of the enzymes may vary between GalNAc-Ts and enzyme preparations. The conditions of the glycosylation reaction may therefore need to be optimized accordingly.
We have demonstrated that accurate quantification by MS+ peak integration is possible due to comparable ionization efficiencies of the peptide starting material and glycopeptide product. Nevertheless, since ion counts might differ between (glyco-)peptides, it is recommended that whenever a new (glyco-)peptide is used, the ion count is compared against the native enzyme-substrate pair to confirm that this is the case. We have previously demonstrated that compound 1 is compatible with several GalNAc-Ts for the BH approach in that orthogonal enzyme activity is observed for compound 1 as a substrate (activity is observed with BH GalNAc-T only, with comparable glycopeptide turnover to the native system).12 However, alternative UDP-GalNAc analogues may be investigated for other GalNAc-T systems.

Michaelis-Menten kinetics
Timing: 2 days
This step confirms whether the BH GalNAc-T/compound 1 pair retains the kinetic parameters of its WT GalNAc-T/UDP-GalNAc counterpart. First, the enzyme concentration at which ~10%-20% glycopeptide formation is achieved needs to be determined; this value is chosen to fulfill the prerequisites of Michaelis-Menten kinetics.
14. Set up in vitro glycosylation reactions using different WT and BH GalNAc-
24. Determine the kinetic parameters for the BH GalNAc-T/compound 1 and WT GalNAc-T/UDP-GalNAc pairs.
   a. We use the non-linear Michaelis-Menten and kcat fitting programs found in GraphPad Prism (https://www.graphpad.com/guides/prism/latest/curve-fitting/reg_michaelis_menten_enzyme.htm).
CRITICAL: Always handle the enzyme on ice and add it to the reaction mixture last. If any of the requirements of Michaelis-Menten kinetics are not fulfilled, the model will be unsuitable. Furthermore, not all enzymes adhere to Michaelis-Menten kinetics. The Michaelis-Menten equation describes a rectangular hyperbola. If the plot of initial rate of reaction against substrate concentration does not display this behavior, alternative kinetic models may be more suitable.
Note: Higher WT and BH GalNAc-T enzyme concentrations can be investigated. The peptide concentration used in the experiment using different UDP-sugar concentrations will depend on the Km of the GalNAc-T with the peptide used. We chose 1.5 h as a convenient endpoint to measure glycopeptide formation at the initial stage of the reaction. Alternative timepoints may be used instead as long as ~10% glycopeptide formation is obtained. Please refer to Choi et al.12 for examples of Michaelis-Menten kinetics plots of WT and BH enzyme-substrate pairs for GalNAc-T1, -T2 and -T10.
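For a quick check of the kinetic parameters outside GraphPad Prism, the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]), can be linearized. The sketch below uses the Hanes-Woolf rearrangement, [S]/v = [S]/Vmax + Km/Vmax, fitted by ordinary least squares. This is a deliberately simpler stand-in for the non-linear regression used in the protocol, and it weights the data differently, so it should be treated as a plausibility check rather than a replacement. All names and example values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hanes_woolf_fit(substrate, rate):
    """Estimate (Vmax, Km) from initial rates via the Hanes-Woolf plot.

    Plots [S]/v against [S]: slope = 1/Vmax, intercept = Km/Vmax.
    """
    s_over_v = [s / v for s, v in zip(substrate, rate)]
    slope, intercept = fit_line(substrate, s_over_v)
    vmax = 1.0 / slope
    km = intercept / slope
    return vmax, km


def kcat(vmax, enzyme_concentration):
    """Turnover number, with Vmax and [E] in matching concentration units."""
    return vmax / enzyme_concentration


if __name__ == "__main__":
    # Hypothetical UDP-sugar concentrations (uM) and initial rates (uM/min).
    s = [10, 25, 50, 100, 250, 500]
    v = [0.32, 0.66, 0.98, 1.35, 1.66, 1.80]
    vmax, km = hanes_woolf_fit(s, v)
    print(f"Vmax ~ {vmax:.2f} uM/min, Km ~ {km:.0f} uM")
```

If the Hanes-Woolf plot is visibly curved, that is an early sign that the data do not follow a rectangular hyperbola and that an alternative kinetic model may be more suitable.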

In-Fusion Cloning of full-length WT and BH GalNAc-T into pSBbi plasmids
Timing: 1 week
This step is performed to clone full-length WT and BH GalNAc-T with a C-terminal VSV-G tag into the backbone of pSBbi-based plasmids containing FLAG-tagged WT/mut-AGX1.
Pause point: The purified linearized vector and PCR products can be stored at −20 °C prior to the In-Fusion cloning reaction. The cloning reactions can be stored at −20 °C prior to the transformation step.

Timing: 2 weeks
This step is performed to establish stable K-562 cell lines using the pSBbi plasmids co-expressing VSV-G-tagged WT/BH GalNAc-T and FLAG-tagged WT/mut-AGX1. These plasmids also contain a hygromycin resistance gene to allow the generation of stable colonies through hygromycin selection (Figure 1). We co-transfect the pSBbi plasmids with a Sleeping Beauty transposase for stable integration of the plasmid DNA into the genome of the K-562 cells.
36. Plate K-562 cells at 70%-90% confluency in 1.5 mL of growth medium in a 6-well plate.
37
CRITICAL: As the CF680 picolyl azide is light-sensitive, once the CuAAC solution is added to the samples these should be covered with aluminum foil. The Revert™ Destaining Solution should not be left on for longer than 10 min. Compound 1 has been shown to be accepted by GALE,1 an epimerase which catalyzes the interconversion of UDP-GalNAc and UDP-GlcNAc in living cells. This means that the alkyne-containing UDP-GlcNAc analogue product of the epimerization reaction may be incorporated into GlcNAc-containing extracellular glycoproteins, such as N-glycans. PNGase F treatment of the cell lysates is therefore crucial to remove N-glycans prior to analysis. This helps discern between N- and O-glycans and reduces background fluorescence.
Note: The cell lysates can be stored at −20 °C for short-term storage or at −80 °C for long-term storage. We recommend using K-562 cells to assess whether the approach works. Other cell lines may need optimization in constructs, sugar concentration and feeding time. We also recommend using constructs with both AGX1 and GalNAc-T on the same plasmid. Co-transfection or sequential transfection has led to inferior labeling outcomes in the past. This may be particularly important in the case of cell lines that make elaborated O-glycans, as these likely incorporate the GlcNAc derivative into their O-glycans. Many cancer cell lines have short O-GalNAc glycans and are easier to use in this procedure. Analyzing in-gel fluorescence from glycoproteins labeled with the CF680 fluorophore using a laser-scanning system has given the best results. Other read-outs, e.g., using alternative fluorophores, camera-based imagers or blots to visualize the labelled glycoproteins, have led to lower signals. We use an Odyssey CLx Imager and Image Studio software for image acquisition and processing.
Optional: An expression vector containing VSV-G-tagged WT/BH GalNAc-T under the control of a Dox-inducible promoter may be used as a control to assess background protein labeling in the absence of the GalNAc-T. GALE-KO cells may be used to prevent the epimerization of compound 1 to the corresponding UDP-GlcNAc analogue by GALE. PNGase F treatment would not be necessary in this case.

Timing: 2 weeks
This step is performed to prepare peptide and glycopeptide fractions from lysates of MCF7 cells transfected with the BH GalNAc-T system for subsequent MS-proteomics and glycoproteomics.
      ii. Use a 1 mL syringe attached to an applicator to push the liquid through the column.
      iii. Discard flow-through.
      iv. Repeat conditioning step.
   c. Load sample:
      i. Reconstitute sample in 100 µL of loading buffer.
      ii. Load sample on the column.
      iii. Use a 1 mL syringe attached to an applicator to push the liquid through the column.
      iv. Discard flow-through.
   d. Wash column:
      i. Add 50 µL of wash buffer to the column to wash away any traces of salts.
      ii. Use a 1 mL syringe attached to an applicator to push the liquid through the column.
      iii. Discard flow-through.
   e. Elute sample:
      i. Place the UltraMicroSpin™ column in a clean 1.5 mL centrifuge tube.
      ii. Add 50 µL of elution buffer to the column.
      iii. Use a 1 mL syringe attached to an applicator to push the liquid through the column.
      iv. Collect flow-through.
   f. Vacuum-dry the eluted sample to remove any traces of organic solvent.
      i. Store eluted sample at −80 °C.
CRITICAL: From the protein precipitation in methanol step onwards all reagents used should be LC-MS grade. To characterize O-GalNAc glycans specifically, the lysates should be treated with PNGase F prior to the click reaction or GALE KO cells should be used for the experiment. We recommend using Protein LoBind® Tubes throughout the procedure to minimize sample loss due to protein-surface binding.
Pause point: The whole cell lysates can be stored at −20 °C prior to the click reaction. The clicked lysates can also be stored at −20 °C prior to the protein precipitation. Combined supernatants after protein precipitation and resuspension can be stored at −20 °C prior to enrichment. The dried pellets before sample-desalting can be stored at −80 °C. The dried pellets after sample-desalting can be stored at −80 °C.
Note: The protein expression levels, sugar feeding conditions and click reaction efficiency may vary between cell lines so the experimental conditions might need to be optimized. The cells should be 30%-60% confluent at the time of sugar feeding. We recommend detaching the cells by pipetting using a 1 mL pipette since we have seen that scraping can lead to a lower cell viability.
Optional: When desalting the peptides, the flow-through may be collected after loading the sample and washing the column in case not all the sample is retained in the column. If the secretome is to be analyzed instead of whole cell lysates the procedure is the same with the exception that serum-free medium should be used and two 15 cm dishes should be prepared per sample instead of one. After sugar feeding and incubation for 16-20 h, the secretome can be collected and harvested to remove cellular debris. The samples can then be concentrated to 200 µL using Amicon Ultra-15 Centrifugal Filters (3 kDa MWCO) and the medium exchanged with PBS. Although we have not done this ourselves, we believe that the AVIDITY assay34 may be applicable for estimating the amount of protein bound to the neutravidin beads during the enrichment step. The beads would first have to be incubated with HABA, the supernatant collected and the absorbance of free HABA at 350 nm measured. The supernatant would then be returned to the beads and the samples added. The supernatant would be collected and the absorbance at 350 nm measured again. Since the biotinylated glycoproteins in the samples will displace the HABA bound to the beads, the change in absorbance due to free HABA in the supernatant before and after adding the samples to the beads could be used to infer the amount of protein bound to the beads.

Timing: 1 day
This step is performed to acquire the raw MS data from the peptide and glycopeptide fractions from lysates of cells transfected with the BH GalNAc-T system.
      ii. Generate HCD data with the Set the precursor automated gain control (AGC) settings to 3e5 ions and set the isolation window for HCD to 1.6 Da and the collision energy to 30.
      iii. Enable dynamic exclusion with a repeat count of 3, repeat duration of 10 s, and an exclusion duration of 10 s.
      iv. MS2 spectra should be generated using an Orbitrap at top speed for 3 s.
164. For glycoproteomic analyses, set up the instrument to acquire data in a dependent fashion using HCD product dependent electron transfer dissociation (HCD-pd-ETD).
   a. Details on the instrument method are as follows:

      i. Full mass spectra should have an MS1 precursor mass resolution set to 60,000 at FWHM 400 m/z, a mass range of 350-1,500 m/z, and sample charge states 2-6.
      ii. Generate HCD data with the Set the precursor AGC settings to 3e5 ions and set the isolation window for HCD to 1.6 Da and the collision energy to 30.
      iii. Enable dynamic exclusion with a repeat count of 3, repeat duration of 10 s, and an exclusion duration of 10 s.
      iv. MS2 spectra should be generated at top speed for 3 s.
      v. To enable HCD-pd-ETD, select '' Targeted
Note: Due to the labile nature of glycan modifications, we recommend running glycopeptide samples on an Orbitrap Eclipse with ETD (Thermo Fisher) coupled to an UltiMate 3000 RSLCnano. For the peptide samples we perform three 5 µL injections per sample to have technical replicates, whilst for the glycopeptide samples we perform one 15 µL injection instead. Additionally, these settings can vary depending on the HPLC and mass spectrometer being used and should be optimized for different systems.

Timing: 2 days
This step is performed to analyze the raw MS data from the peptide fraction from lysates of cells transfected with the BH GalNAc-T system by label-free quantitative analysis to dissect the protein substrates of the GalNAc-T of interest.
   a. Load raw mass spectrometry files.
   b. Select Set experiment to name each sample.
      i. If you want to treat your technical replicates as a single experiment, they should have the same name in the Experiment column.
   c. In the Group-specific parameters section:
      i. In Type select Standard and Multiplicity 1.
      ii. In Modifications include methionine oxidation and N-terminal acetylation as variable modifications and cysteine carbamidomethylation as a fixed modification with a total common max of 5.
      iii. In Instrument modify the default parameters based on the instrument used.
      iv. In Digestion mode select Specific and in Enzyme select Trypsin/P. Allow two missed cleavages.
      ii. In First group (right) and Second group (left) select the two samples to be analyzed.
      iii. In Test select Welch's T-test.
      iv. In Use for truncation select p-value.
      v. Select OK.
   j. Use the Scatter plot function to visualize the data produced.
i. On the x-axis select Welch's T-test difference.
ii. On the y-axis select -Log Welch's T-test p-value.
iii. Hits will be considered if they have a -log Welch T-test p-value greater than 1.3 and a Welch T-test difference greater than 3 (i.e., eightfold enrichment).
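The hit criteria in step j.iii can be expressed as a simple filter. This sketch assumes that the Welch's T-test difference is reported on a log2 scale, which is what makes a difference of 3 correspond to eightfold enrichment (2^3 = 8), and that the p-value is reported as -log10(p), so that the threshold of 1.3 corresponds to p of roughly 0.05. The function name and the example rows are hypothetical.

```python
import math

NEG_LOG10_P_THRESHOLD = 1.3   # -log10(p); about p < 0.05
LOG2_DIFF_THRESHOLD = 3.0     # log2 difference; 2**3 = eightfold


def is_hit(neg_log10_p, log2_difference):
    """Apply both thresholds from step j.iii (strictly greater than)."""
    return (neg_log10_p > NEG_LOG10_P_THRESHOLD
            and log2_difference > LOG2_DIFF_THRESHOLD)


if __name__ == "__main__":
    # What the thresholds mean on the untransformed scales.
    print(f"p-value cutoff: {10 ** -NEG_LOG10_P_THRESHOLD:.3f}")
    print(f"fold-change cutoff: {2 ** LOG2_DIFF_THRESHOLD:.0f}x")

    # Hypothetical rows: (protein, -log10 p, log2 difference).
    rows = [("protein A", 2.4, 4.1),
            ("protein B", 0.9, 5.0),
            ("protein C", 3.0, 1.2)]
    hits = [name for name, p, d in rows if is_hit(p, d)]
    print("Hits:", hits)
```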
Note: If cells from species other than human are used, the FASTA file from the corresponding species should be used instead when performing the database search with MaxQuant. For example, if murine cells are used, the Mus musculus FASTA file should be uploaded. The avidin FASTA file can also be downloaded from UniProt and uploaded to MaxQuant to identify any contaminants introduced into the samples by the neutravidin beads during the enrichment process.
Mass spectrometry data analysis of glycopeptides using Byonic

Timing: 1 week
This step is performed to analyze the raw MS data from the glycopeptide fraction from lysates of cells transfected with the BH GalNAc-T system to identify GalNAc-T-specific glycosylation sites.
168. Search your raw files with 10 ppm mass tolerance for precursor mass ions, with 20 ppm and 0.2 Da fragment mass tolerances for HCD and ETD fragmentation, respectively.
169. Allow up to two missed cleavages per peptide and semi-specific, C-terminal tryptic digestion (R, K cleavage sites).
a. Use a 1% false discovery rate using standard reverse-decoy techniques.
170. Methionine oxidation (common 1) and asparagine deamidation (common 1) should be set as variable modifications with a total common max of 2, rare max of 1.
a. Carbamidomethyl should be set as a fixed modification.
171. Under the ''Advanced'' tab in the Byonic interface, select ''Create focused database'' and start the search.
a. This will create a FASTA file containing only the proteins that Byonic finds in the sample, which can then be used to reduce search times with glycans added.
172. Edit the existing Byonic parameter file by selecting the focused database FASTA file.
173. Under the ''Glycans'' tab, add the following modifications (Table 2).
a. Manually examine the HCD spectra of any remaining glycopeptides in the Excel file.
b. Each HCD spectrum with a GalN6yne modified peptide should contain an oxonium ion at m/z 491 and another at m/z 330.
c. Additionally, the peptide should have at least 50% of the naked (i.e., non-glycosylated) b/y ions annotated.
d. If the peptide meets these criteria, then move on to the next step, otherwise the peptide is not correctly annotated and should not be counted as a modified glycopeptide.
177. Extract the associated chromatogram in Qual Browser along with the associated ETD MS2 spectra.

OPEN ACCESS
178. Average the MS2 spectra generated and de novo sequence the glycopeptide to identify which site is modified (Figure 7).
a. A detailed tutorial for manual interpretation of ETD spectra is available here. 35
Note: For a more detailed explanation on mass spectrometry analysis using Byonic please refer to Malaker et al. 36 We use Byonic to analyse the data from the glycopeptide fraction but alternative software can be used instead.
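The manual HCD inspection criteria above (oxonium ions at 491 and 330, and at least 50% of the naked b/y ions annotated) can be expressed as a simple pre-filter before manual review. This is a minimal sketch and not part of Byonic or any other software: the function names, the input format and the 0.5 m/z matching window are assumptions, and the oxonium ion values are the nominal masses quoted in the text.

```python
# Nominal oxonium ion m/z values quoted in the protocol text.
OXONIUM_MZ = (491.0, 330.0)
MZ_TOLERANCE = 0.5       # assumed matching window, adjust to your instrument
MIN_BY_FRACTION = 0.5    # at least 50% of naked b/y ions annotated


def has_peak(peaks_mz, target, tol=MZ_TOLERANCE):
    """Return True if any observed peak lies within tol of the target m/z."""
    return any(abs(mz - target) <= tol for mz in peaks_mz)


def passes_manual_criteria(peaks_mz, annotated_by, expected_by):
    """Check both criteria for one HCD spectrum.

    peaks_mz:     observed fragment m/z values (hypothetical input format)
    annotated_by: number of naked b/y ions annotated in the spectrum
    expected_by:  number of theoretical naked b/y ions for the peptide
    """
    if expected_by <= 0:
        return False
    oxonium_ok = all(has_peak(peaks_mz, target) for target in OXONIUM_MZ)
    coverage_ok = annotated_by / expected_by >= MIN_BY_FRACTION
    return oxonium_ok and coverage_ok


# Hypothetical spectrum with both oxonium ions and 10 of 18 b/y ions annotated
print(passes_manual_criteria([204.1, 330.2, 491.2, 800.4], 10, 18))
```

A filter of this kind only prioritizes spectra; the manual examination and de novo sequencing described in the steps above are still required to assign the modified site.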

EXPECTED OUTCOMES
After secreted protein expression and Ni-NTA purification of WT and BH GalNAc-T, SDS-PAGE gels can be run to evaluate the success of the expression and purification of the enzymes (Figure 8). A band at the expected molecular weight of the protein should be seen in the fractions, with most of the purified protein observed in the eluted fractions. Ideally the eluted fraction chosen to be used in downstream applications should contain as few impurities as possible, as these could interfere with the activity of the enzyme.
When evaluating the activity of the purified enzymes by in vitro glycosylation experiments, the WT enzyme should show activity with its native substrate (UDP-GalNAc) exclusively, whereas the BH GalNAc-T should only show activity with the corresponding ''bumped'' analogue (compound 1).
When performing the in vitro glycosylation experiments with a small panel of synthetic peptides, the glycosylation profile of the BH GalNAc-T should match that of the corresponding WT GalNAc-T in terms of peptide substrate specificity and glycopeptide formation.
If the cell surface labeling experiments have been successful, labeling should be observed in the cells fed with Ac4ManNAl whilst no labeling should be seen in cells fed with DMSO. In cells fed with compound 2, the strongest labeling should be observed when mut-AGX1 and BH GalNAc-T are present in the cell (Figure 9). Any background labeling due to GlcNAc-containing glycoproteins should be removed upon treatment with PNGase F, in particular in the cells fed with Ac4ManNAl, as the sugar is an N-acetylneuraminic acid (Neu5Ac) precursor and Neu5Ac is known to typically cap N-glycans. Different GalNAc-Ts should produce slightly different band patterns, reflecting the different protein substrate specificities of the different isoenzymes.

In vitro glycosylation experiments
At least three independent replicates should be performed for each experiment. Each independent replicate should consist of two technical replicates for each UDP-sugar/peptide used.

An average value of glycopeptide formation should be calculated using the two technical replicates for each UDP-sugar/peptide used. The final values should be reported as the mean glycopeptide formation across the three independent replicates ± standard deviation.
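The averaging scheme described above can be sketched as follows. This is a minimal illustration using only the Python standard library. The function name and the example values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, as the text does not specify which form is used.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Summarize glycopeptide formation for one UDP-sugar/peptide pair.

    replicates: one list per independent replicate, each holding the
    technical replicate values (two per replicate in this protocol).
    Returns (mean, standard deviation) across independent replicates.
    """
    # Step 1: average the technical replicates within each independent replicate
    per_replicate = [mean(technical) for technical in replicates]
    # Step 2: mean and sample standard deviation across independent replicates
    return mean(per_replicate), stdev(per_replicate)


# Hypothetical % glycopeptide formation: 3 independent x 2 technical replicates
example = [[40.0, 44.0], [50.0, 54.0], [45.0, 47.0]]
avg, sd = summarize(example)
print(f"{avg:.1f} ± {sd:.1f}")
```

Averaging the technical replicates first means the reported standard deviation reflects variation between independent experiments only.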

Michaelis-Menten kinetics
For the in vitro glycosylation experiments using different enzyme concentrations: At least three independent replicates should be performed for each experiment. Each independent replicate should consist of two technical replicates for each enzyme concentration used.
An average value of glycopeptide formation should be calculated using the two technical replicates for each enzyme concentration used. The final values should be reported as the mean glycopeptide formation across the three independent replicates ± standard deviation.
For the in vitro glycosylation experiments using different UDP-sugar concentrations: At least three independent replicates should be performed for each experiment. Each independent replicate should consist of two technical replicates for each UDP-sugar concentration used.
An average value of glycopeptide formation should be calculated using the two technical replicates for each UDP-sugar concentration used. This average glycopeptide formation should be divided by the duration of the reaction (5400 s) to calculate the initial rate of reaction at each substrate concentration. A mean value of initial rate of reaction should then be calculated for each UDP-sugar concentration using the results of the three independent replicates.
The mean initial rate of reaction should then be plotted against UDP-sugar concentration, with error bars representing the standard deviation. The kinetic parameters can then be calculated as explained above.
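The initial rate calculation described above can be sketched as follows. This is a minimal illustration, assuming the reaction duration of 5400 s stated in the text. The function names, the dictionary layout, the units and the example values are hypothetical, and the sample standard deviation is an assumption. Fitting of the kinetic parameters is not shown.

```python
from statistics import mean, stdev

REACTION_TIME_S = 5400  # reaction duration stated in the protocol


def initial_rate(technical_values, duration_s=REACTION_TIME_S):
    """Average the technical replicates and divide by the reaction time."""
    return mean(technical_values) / duration_s


def rate_summary(data):
    """Mean and standard deviation of the initial rate per concentration.

    data: {UDP-sugar concentration: [[tech1, tech2], ...]} with one inner
    list per independent replicate (hypothetical layout).
    """
    summary = {}
    for conc, replicates in sorted(data.items()):
        rates = [initial_rate(technical) for technical in replicates]
        summary[conc] = (mean(rates), stdev(rates))
    return summary


# Hypothetical glycopeptide formation values at a single concentration
example = {10.0: [[5.4, 5.4], [10.8, 10.8], [16.2, 16.2]]}
summary = rate_summary(example)
for conc, (rate, sd) in summary.items():
    print(conc, rate, sd)
```

The resulting mean rates and standard deviations correspond to the points and error bars of the plot against UDP-sugar concentration.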

LIMITATIONS
The BH GalNAc-T will be in competition for protein substrates and glycosylation sites with the corresponding WT GalNAc-T. This may lead to misleading glycoproteomic analyses. This limitation might be overcome by knocking out the endogenous gene encoding the GalNAc-T of interest by CRISPR-Cas9 followed by stable transfection of the BH GalNAc-T. Alternatively, if compensatory mechanisms take place between the knockout of the endogenous GalNAc-T and the transfection of the BH GalNAc-T, homology-directed repair (HDR) editing can be performed instead to mutate the native allele to the corresponding BH mutant.
The sugar modification may have functional consequences such as affecting antibody/glycan-binding protein recognition, protein-protein interactions, protein stability and trafficking. This means that this system may have limited use in determining the biological function of glycosylation on its protein substrates.
Similarly, if there are significant differences between the kinetic parameters of the WT GalNAc-T/UDP-GalNAc and BH GalNAc-T/compound 1 pairs, this may have biological consequences in the cell. Nevertheless, the BH enzyme-substrate pairs investigated to date have shown comparable kinetic parameters to their native counterparts.
When the cells are transfected with pSBbi plasmids, the BH GalNAc-T will be overexpressed compared to the corresponding endogenous enzyme. The original publication 1 demonstrated that BH GalNAc-Ts do not introduce new glycosylation sites in their protein substrates so no false-positive hits should be seen even if the BH GalNAc-T is overexpressed. Nevertheless, an ideal discovery tool would assess native glycosylation levels. This limitation can be overcome by using Dox-inducible promoters to fine-tune the expression levels of the BH GalNAc-T.
On a similar note, we have seen that overexpression of GalNAc-Ts results in reduced sialylation of N- and O-glycans, suggesting an alteration in the cellular glycome. 3 This observation should not impair the overall aim of identifying the protein substrates of a specific GalNAc-T, but it means that this approach does not fully replicate the native biological system.
While O-GalNAc glycosylation is found in many types of glycoproteins, it is abundantly present in mucins. However, mucins are challenging to study by MS due to their size, their protease resistance, their dense glycosylation and poor ionizability. 37 Consequently, this can mean that even if we can successfully label the protein substrates of our GalNAc-T of interest with the modified sugar, we still may not be able to detect and characterize these glycoproteins by glycoproteomics. Treatment of the cell lysates with the mucin-selective protease StcE can facilitate this process by breaking the mucins down into smaller fragments which are more amenable to study by MS. 37 Recent advances in mucin-selective enrichment strategies and in MS fragmentation methods can also aid in mucin glycoproteomics. 38

Potential solution
The PCR conditions may require optimization, such as performing a gradient PCR to find the most appropriate annealing temperature, increasing the denaturation/extension time and/or increasing the number of cycles (although increasing the number of cycles over 35 may be counterproductive as the dNTPs get depleted and this may result in premature stops and truncated products). Alternatively, a small amount of DMSO (e.g., 1%-4%) can be added to the PCR reaction mixture to help lower the melting temperature of the DNA template. This can be particularly helpful for DNA templates with a high GC content.

Problem 2
Failure to obtain colonies of the transformed bacteria following the PCR reaction and KLD treatment (step 1).

Potential solution
The PCR/KLD products may be purified and concentrated prior to the transformation step. We use the NucleoSpin® Gel and PCR Clean-up Kit according to the manufacturer's instructions for this purpose (https://www.takarabio.com/documents/User%20Manual/NucleoSpin%20Gel%20and%20PCR%20Clean/NucleoSpin%20Gel%20and%20PCR%20Clean-up%20User%20Manual_Rev_04.pdf). The KLD reaction time may also be extended up to 30 min at 37°C to improve the efficiency of the reaction.

Problem 3
No expression of recombinant WT and BH GalNAc-Ts (step 8).

Potential solution
The protein expression and purification strategy may require optimization, such as using alternative constructs, using different tags/protein purification strategies and using different expression systems.

Problem 4
No in-gel fluorescence signal observed in the cell labeling experiments (step 64).

Potential solution
The efficacy of the CuAAC reaction may vary depending on the source of the reagents, therefore the concentration of the CuAAC reaction components and the duration of the reaction may need to be optimized.

Problem 5
Labeling seen with WT GalNAc-T in the cell labeling experiments (even following PNGase F treatment of the cell lysates) (step 64).

Potential solution
This may occur due to saturation of the labeling signal when a high sugar concentration is used. Reducing the sugar concentration, for example to 1 mM, should result in labeling only when BH GalNAc-T and mut-AGX1 are present in the cell. Any labeling signal still observed after reducing the sugar concentration may be due to lingering N-glycans as a result of reduced N-glycan cleavage by PNGase F after the click reaction. If this occurs, GALE KO cells should be used instead to prevent the epimerization of compound 1 into the corresponding UDP-GlcNAc analogue and subsequent incorporation into GlcNAc-containing glycoproteins.

Problem 6
No hits obtained after performing the glycoproteomics analysis on the glycopeptide fraction (step 175).

Potential solution
The sample preparation may require optimization, such as using different sample clean up techniques after the CuAAC reaction, using endoglycosidases and alternative/additional proteases to Lys-C and trypsin and modifying the MS method. Nevertheless, the peptide fraction should be indicative of whether the GalNAc-T of interest has specific peptide substrates.

RESOURCE AVAILABILITY
Lead contact
Further information and requests for renewable resources and reagents should be directed to and will be fulfilled by the lead contact, Benjamin Schumann, b.schumann@imperial.ac.uk. For nonrenewable resources such as synthetic compounds, we will fulfill requests to the best of our abilities.